The SDQZ-Pipe Technique: Doing More With Less, Faster and Cheaper Reading UNIX-Compressed SAS Data Sets Directly in a Data Step

Author

  • Randy Hirscher
Abstract

While hard disk space is remarkably cheap nowadays, many programmers still work in environments with significant disk constraints. Ever-growing data sets quickly eat up the disk "real estate" that gives us immediate, direct access to our data. How many of us have spent an inordinate amount of time on mundane, unwelcome tasks such as temporarily compressing data, moving it offline, slicing and dicing it, or making some other concession, just because there was not enough disk space for what we needed to do? To save disk space, programmers working in a UNIX environment typically store SAS data sets in UNIX-compressed format. Unfortunately, SAS cannot directly read a UNIX-compressed, random access SAS data set, and the typical "INFILE...PIPE" approach does not work with SAS random access (ssd01) files. Programmers are instead forced to uncompress SAS data sets in an entirely separate step before reading the data in SAS. The technique presented in this paper, however, reduces the processing of a UNIX-compressed SAS data set to one step using named pipes and the TAPE sequential engine, and for this reason it should prove a valuable addition to any programmer's toolbox.
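
The named-pipe idea behind the technique can be illustrated outside SAS. The following is a minimal Python sketch, not the paper's SAS code: it assumes a gzip-compressed text file rather than a UNIX-compressed (.Z) SAS data set, it requires a UNIX-like system, and the names `stream_through_fifo` and `read_compressed_via_fifo` are hypothetical. A writer thread decompresses into a named pipe while the reader consumes the stream sequentially, so no uncompressed copy is written to disk.

```python
import gzip
import os
import shutil
import tempfile
import threading


def stream_through_fifo(compressed_path, fifo_path):
    """Decompress compressed_path and write the bytes into the named pipe."""
    # Opening the pipe for writing blocks until a reader opens the other end.
    with gzip.open(compressed_path, "rb") as src, open(fifo_path, "wb") as dst:
        shutil.copyfileobj(src, dst)


def read_compressed_via_fifo(compressed_path):
    """Return the lines of a gzip file, read sequentially through a named pipe."""
    workdir = tempfile.mkdtemp()
    fifo_path = os.path.join(workdir, "data.pipe")
    os.mkfifo(fifo_path)  # create the named pipe (UNIX only)

    # The writer plays the role of the background decompression command.
    writer = threading.Thread(
        target=stream_through_fifo, args=(compressed_path, fifo_path)
    )
    writer.start()
    try:
        # The reader sees an ordinary sequential file; it cannot seek,
        # which is why a sequential (not random access) reader is needed.
        with open(fifo_path, "rb") as pipe:
            records = [line.rstrip(b"\n") for line in pipe]
    finally:
        writer.join()
        os.remove(fifo_path)
        os.rmdir(workdir)
    return records
```

Because a pipe can only be read front to back, the consumer must tolerate sequential access, which is consistent with the abstract's pairing of named pipes with the TAPE sequential engine.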

Similar resources

Another Factorial File Compression Experiment Using SAS

Continuing experimental work on SAS data set compression presented at NESUG in 2004, I designed another two-factor factorial experiment. My first factor compares the three kinds of data set compression offered by SAS on UNIX: the SAS data set options COMPRESS=CHAR, COMPRESS=BINARY and SAS sequential format files created with the V9TAPE engine, three UNIX file compression algorithms: compress, g...

196-2007: Pipes and Threads: Performance Testing of Advanced Scalability Features in SAS®9

SAS v9 provides many new features of interest to users with large data sets. Many of the commonly used procedures are multi-threaded. This allows users to harness the full power of multi-processor architectures to enhance the efficiency of some tasks. In addition, SAS/Connect allows users to spawn multiple independent processes to execute custom parallel processing solutions. SAS/Connect also en...

Optimizing System Performance by Monitoring UNIX Server with SAS®

To optimize system performance and maximize productivity, UNIX performance and resource usage should be monitored. The monitoring is often done manually by sporadic checking with commands such as ‘top’ and ‘ps’. A utility has been developed using SAS that monitors and identifies extremely resource-consuming processes and sends e-mail to notify the owner of the processes. The utility is automatic...

Location of compressed natural gas stations using multi-objective flow refueling location model in the two-way highways: A case study in Iran

The increasing use of fossil fuels brings severe environmental and economic problems, drawing more attention to alternative fuels. Compressed natural gas (CNG), as an alternative fuel, offers many more benefits than gasoline or diesel fuel, such as cost-effectiveness, lower pollution, better performance, and lower maintenance costs. Gas stations location and the number of gas stations are ...

The Utah Raster Toolkit

The Utah Raster Toolkit is a set of programs for manipulating and composing raster images. These tools are based on the Unix concepts of pipes and filters, and operate on images in much the same way as the standard Unix tools operate on textual data. The Toolkit uses a special run length encoding (RLE) format for storing images and interfacing between the various programs. This reduces the disk...

Publication date: 1999